Self-attention in vision transformers performs perceptual grouping, not attention
Abstract
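Recently, a considerable number of studies in computer vision involve deep neural architectures called vision transformers. Visual processing in these models incorporates computational models that are claimed to implement attention mechanisms. Despite an increasing body of work that attempts to understand the role of attention mechanisms in vision transformers, their effect is largely unknown. Here, we asked if the attention mechanisms in vision transformers exhibit similar effects as those known in human visual attention. To answer this question, we revisited the attention formulation in these models and found that, despite the name, computationally these models perform a special class of relaxation labeling with similarity grouping effects. Additionally, whereas modern experimental findings reveal that human visual attention involves both feed-forward and feedback mechanisms, the purely feed-forward architecture of vision transformers suggests that attention in these models cannot have the same effects as those known in humans. To quantify these observations, we evaluated grouping performance in a family of vision transformers. Our results suggest that self-attention modules group figures in the stimuli based on similarity of visual features such as color. Also, in a singleton detection experiment as an instance of salient object detection, we studied if these models exhibit similar effects as those of the feed-forward visual salience mechanisms thought to be utilized in human visual attention. We found that, generally, the transformer-based attention modules assign more salience either to distractors or to the ground, the opposite of both human and computational salience. Together, our study suggests that the mechanisms in vision transformers perform perceptual organization based on feature similarity and not attention.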
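To make the similarity-grouping claim concrete, the following is a minimal sketch of standard single-head scaled dot-product self-attention in NumPy. It is not the authors' code or their experimental setup. The toy "red" and "green" patch features, the identity projection matrices, and the function name `self_attention` are all assumptions chosen for illustration.

```python
import numpy as np

def self_attention(x, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention over tokens x of shape (n, d)."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    # Pairwise similarity between every query and every key.
    scores = q @ k.T / np.sqrt(k.shape[-1])
    # Numerically stable row-wise softmax.
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    # Each output token is a similarity-weighted average of the values.
    return weights @ v, weights

# Hypothetical toy stimulus: two "red" patches and two "green" patches.
red = np.array([1.0, 0.0])
green = np.array([0.0, 1.0])
tokens = np.stack([red, red, green, green]) * 4.0

# Identity projections (an assumption) so that scores reflect raw feature similarity.
identity = np.eye(2)
output, weights = self_attention(tokens, identity, identity, identity)

print(np.round(weights, 3))
```

In this toy case each patch places nearly all of its weight on patches of the same color, so the output for a token is essentially the average of its own color group. That is the kind of similarity-driven grouping the abstract describes, shown here only under the stated assumptions.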
Related resources
Auditory perceptual grouping and attention in dyslexia.
Despite dyslexia affecting a large number of people, the mechanisms underlying the disorder remain undetermined. There are numerous theories about the origins of dyslexia. Many of these relate dyslexia to low-level, sensory temporal processing deficits. Another group of theories attributes dyslexia to language-specific impairments. Here, we show that dyslexics perform worse than controls on an ...
Interactions between attention and perceptual grouping in human visual cortex.
Freeman et al. demonstrated that detection sensitivity for a low contrast Gabor stimulus improved in the presence of flanking, collinearly oriented grating stimuli, but only when observers attended to them. By recording visual event-related potentials (ERPs) elicited by a Gabor stimulus, we investigated whether this contextual cueing effect involves changes in the short-latency afferent visual ...
Perceptual grouping via spatial selection in a focused-attention task
Theories of attention can be separated into those that select by location, and those that select by location-invariant representation. Experiments demonstrating stronger interference or facilitation from distractors grouped by nonspatial features with the target than ungrouped distractors have been considered as evidence for the selection of location-invariant representations. However, few stud...
Role of attention and perceptual grouping in visual statistical learning.
Statistical learning has been widely proposed as a mechanism by which observers learn to decompose complex sensory scenes. To determine how robust statistical learning is, we investigated the impact of attention and perceptual grouping on statistical learning of visual shapes. Observers were presented with stimuli containing two shapes that were either connected by a bar or unconnected. When ob...
Texture segregation by visual cortex: Perceptual grouping, attention, and learning
A neural model called dARTEX is proposed of how laminar interactions in the visual cortex may learn and recognize object texture and form boundaries. The model unifies five interacting processes: region-based texture classification, contour-based boundary grouping, surface filling-in, spatial attention, and object attention. The model shows how form boundaries can determine regions in which sur...
Journal
Journal title: Frontiers in Computer Science
Year: 2023
ISSN: 2624-9898
DOI: https://doi.org/10.3389/fcomp.2023.1178450